Abstract

We investigate Abelian primitive words, which are words that are not Abelian powers. We show that the set of
Abelian primitive words is not context-free, and that one can determine whether a word is Abelian primitive in
linear time. In contrast to classical primitive words, a word may have more than one Abelian root. We also
consider the enumeration of Abelian primitive words.

1 Introduction 

Repetition in words is a well-studied topic, and many of the results in this area can be classified into two distinct 
research areas: the theory of formal languages and the study of combinatorics on words. In these two areas, the 
focus on repetition is slightly different: in formal language theory, research focuses on the properties of languages 
containing words with different types of repetition, while in combinatorics on words, research typically concentrates 
on the existence or non-existence of individual words which avoid certain repetitions, and combinatorial enumeration 
of words with or without repetitions. 

An example of a long-standing area of research relating to repetition in both the theory of formal languages and
combinatorics on words is the study of primitive words: a word $x$ is primitive if it cannot be expressed as a
repetition of some shorter word $y$. In combinatorics on words, an elegant derivation of the number of primitive
words of a given length is obtained using Möbius inversion (see, e.g., Lothaire [14]). In formal language theory,
however, it is unknown whether the set of primitive words is a context-free language (see, e.g., Dömösi et al. [10]).
It is known that a closely related set, the set of Lyndon words, is not context-free [4].

In combinatorics on words, a parallel notion to standard repetition is Abelian repetition. A word $x$ is an Abelian
power if it can be divided into blocks $x = x_1 x_2 \cdots x_n$ where every block $x_i$ is a permutation of every other block.

In this paper, we consider the application of Abelian repetition to the concept of primitivity. Despite the naturalness
of this application, the concept does not appear to have attracted much attention before.¹ In a related direction, Czeizler
et al. [7] study repetitions with only limited rearrangement. We study the language of Abelian primitive words, a
formal language theoretic question, as well as the number of Abelian primitive words of a given length, a problem in
combinatorics on words.



2 Definitions 

For additional background in formal languages and automata theory, see Rozenberg and Salomaa [17]. Let $\Sigma$ be a
finite set of letters, called an alphabet. A string over $\Sigma$ is any finite sequence of letters from $\Sigma$. The string
containing no symbols, the empty string, is denoted $\varepsilon$. The set $\Sigma^*$ is the set of all strings over $\Sigma$. A language $L$
is any subset of $\Sigma^*$. If $x = a_1 a_2 \cdots a_n$ is a string, with $a_i \in \Sigma$, then the length of $x$, denoted by $|x|$, is $n$.
For $a \in \Sigma$ and $w \in \Sigma^*$, $|w|_a$ is the number of occurrences of $a$ in $w$.

¹ We have found a reference to a research project studying Abelian primitive words on the web at http://bit.ly/9NWqSL but have been
unable to obtain a copy of any associated works.

For languages $L_1, L_2 \subseteq \Sigma^*$, the left quotient of $L_1$ by $L_2$, denoted $L_2 \backslash L_1$, is defined by

$$L_2 \backslash L_1 = \{x \in \Sigma^* : \exists y \in L_2 \text{ such that } yx \in L_1\}.$$

Given an (ordered) alphabet $\Sigma = \{a_1, \ldots, a_n\}$, the Parikh vector of a word $w \in \Sigma^*$ is $\Psi(w) = (|w|_{a_1}, |w|_{a_2}, \ldots, |w|_{a_n})$.
For the alphabet $\Sigma = \{a,b\}$, we assume $a < b$. Thus, for example, $\Psi(abbab) = (2,3)$.
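The Parikh vector can be computed in one pass over the word. The following is a minimal sketch; the function name `parikh` and the convention of passing the ordered alphabet as a string are assumptions made for illustration, not notation from the paper.

```python
from collections import Counter

def parikh(w, alphabet):
    """Parikh vector of w with respect to the ordered alphabet (a sequence of letters)."""
    counts = Counter(w)
    return tuple(counts[a] for a in alphabet)

print(parikh("abbab", "ab"))  # (2, 3)
```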

We first recall the standard notions of primitive words. A word $w$ is primitive if $w$ cannot be written as $z^k$ for $z \in \Sigma^*$
and $k \ge 2$. If $w$ is not primitive, then there is a unique primitive word $u$ such that $w = u^k$ for some $k \ge 2$. For an
alphabet $\Sigma$, the set of all primitive words $w \in \Sigma^*$ is denoted $Q(\Sigma)$ or simply $Q$ if $\Sigma$ is understood.

We now turn to the generalization of these notions to Abelian repetitions. A word $w$ is an $n$-th Abelian power if
$w = u_1 u_2 \cdots u_n$ for some $u_1, u_2, \ldots, u_n$ such that for all $1 \le i,j \le n$, $\Psi(u_i) = \Psi(u_j)$. That is, each $u_j$ with $j \ge 2$ is a
permutation of $u_1$.

We say that a word $w$ is Abelian primitive (or A-primitive, for short) if $w$ fails to be a $k$-th Abelian power for every
$k \ge 2$. For an alphabet $\Sigma$, the set of all A-primitive words $w \in \Sigma^*$ is denoted by $AQ(\Sigma)$ or simply $AQ$ if $\Sigma$ is understood.

Example 1. The word $w = aabbab$ is A-primitive, while $u = aabbabab$ is not, as $u = xy$ where $\Psi(x) = \Psi(y) = (2,2)$.
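The definitions above translate directly into a brute-force test, which can be used to check Example 1. This is only a sketch for nonempty words: the names `is_abelian_power` and `is_a_primitive` are illustrative, and Parikh vectors are represented by `Counter` objects rather than tuples.

```python
from collections import Counter

def is_abelian_power(w, k):
    """True if w splits into k equal-length blocks with equal Parikh vectors."""
    n = len(w)
    if k < 1 or n == 0 or n % k != 0:
        return False
    d = n // k
    first = Counter(w[:d])
    return all(Counter(w[i:i + d]) == first for i in range(d, n, d))

def is_a_primitive(w):
    """True if w is not a k-th Abelian power for any k >= 2 (brute force)."""
    return not any(is_abelian_power(w, k) for k in range(2, len(w) + 1))

print(is_a_primitive("aabbab"))    # True
print(is_a_primitive("aabbabab"))  # False
```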

Let $w$ be an Abelian power. Then we say that a word $u$ is an Abelian root (or A-root) of $w$ if $w = u u_1 u_2 \cdots u_n$ for
some $u_1, \ldots, u_n \in \Sigma^*$ with $\Psi(u) = \Psi(u_i)$ for all $1 \le i \le n$. If $w$ has an A-root $u$ which is also A-primitive, then we say
that $u$ is an A-primitive root of $w$. Two A-primitive roots $u, v$ of a word $w$ are distinct if $|u|$ does not divide $|v|$ or vice
versa. On the other hand, we note the following simple but useful fact:

Observation 1. If a word $x$ of length $n$ has an A-root of length $k$, then $x$ also has an A-root of length $k'$ for all $k'$ where
$k$ divides $k'$ and $k'$ divides $n$.
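Observation 1 can be checked experimentally by listing the lengths of all proper A-roots of a word. The sketch below is illustrative only; `a_root_lengths` and `closed_under_observation_1` are hypothetical helper names, and only root lengths strictly smaller than the word length are considered.

```python
from collections import Counter

def a_root_lengths(x):
    """Lengths d < |x| dividing |x| such that all length-d blocks of x share a Parikh vector."""
    n = len(x)
    out = []
    for d in range(1, n):
        if n % d == 0:
            first = Counter(x[:d])
            if all(Counter(x[i:i + d]) == first for i in range(d, n, d)):
                out.append(d)
    return out

def closed_under_observation_1(x):
    """Check that every proper multiple k' of a root length k with k' | |x| is a root length."""
    n = len(x)
    roots = set(a_root_lengths(x))
    return all(k2 in roots for k in roots for k2 in range(k, n, k) if n % k2 == 0)

print(a_root_lengths("abbabaab"))  # [2, 4]
```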

We recall some notation from number theory. Recall that if $r, z$ are integers, $r \mid z$ denotes that $r$ divides $z$, i.e., $z = rk$
for some $k > 0$. We say that a set of integers $S$ is division-free if $x \nmid y$ and $y \nmid x$ for all distinct $x, y \in S$. For all $n \ge 2$, let $\omega(n)$
denote the number of distinct prime divisors of $n$, while $\omega'(n)$ is the number of prime divisors of $n$ with multiplicity.² Thus,
if $n \ge 2$ and $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$ is its prime factorization, then $\omega(n) = k$ and $\omega'(n) = \sum_{i=1}^{k} \alpha_i$. We also let $d(n)$ be the
number of divisors of $n$, i.e., $d(n) = \prod_{i=1}^{k} (1 + \alpha_i)$.

3 Non-context-freeness of AQ 

We now show that the set AQ of all A-primitive words is not context-free. This is in contrast to the set of ordinary
primitive words Q, for which it is unknown whether it is a context-free language. We begin with two
preliminary propositions.

² The notation $\Omega(n)$ is also used for what we call $\omega'(n)$, but we reserve $\Omega$ for denoting asymptotic function growth. Our
notation is from Bach and Shallit [3].

Proposition 1. Let $p$ be a prime and $x = aabb(ab)^{p-2}$. Then $x$ is A-primitive.
Proof. Note that $|x| = 2p$. If $x$ is not A-primitive, then one of three cases occurs:

(a) $x = u^{2p}$ for some letter $u$,

(b) $x = u_1 u_2 \cdots u_p$ for words $u_1, \ldots, u_p$ of length two with equal Parikh vectors, or

(c) $x = v_1 v_2$ for words $v_1, v_2$ of length $p$ with equal Parikh vectors.

The first of these possibilities cannot occur, as $x$ contains occurrences of both $a$ and $b$. The second case is also not
possible, since if so, we would have $u_1 = aa$ and $u_2 = bb$, which do not have matching Parikh vectors. Thus, we must
have that $x = v_1 v_2$ for $|v_1| = |v_2| = p$. We have three subcases:

(a) if $p = 2$, then we are in the previous case, i.e., $v_1 = aa$.

(b) if $p = 3$, then $v_1 = aab$ and $v_2 = bab$.

(c) otherwise $p > 3$ and $v_1 = aabb(ab)^{(p-5)/2}a$, which has Parikh vector $((p-5)/2 + 3, (p-5)/2 + 2)$, and
$v_2 = b(ab)^{(p-1)/2}$, which has Parikh vector $((p-1)/2, (p-1)/2 + 1)$. We can see that the number of occurrences of
$a$ in $v_1$ is even, while in $v_2$ it is odd, or vice versa.

In every subcase $\Psi(v_1) \ne \Psi(v_2)$, so the third case cannot occur either.

□

Proposition 2. Let $M = AQ \cap aabb(ab)^*$. Then

$$M = \{aabb(ab)^{p-2} : p \text{ is prime}\}.$$

Proof. The right-to-left inclusion is immediate from Proposition 1.

For the reverse inclusion, let $x \in M$. Then $|x| = 2n$ for some $n \ge 2$. Suppose, contrary to what we want to prove,
that $x$ is not of the form $aabb(ab)^{p-2}$ for some prime $p$. Then we must have that $n$ is not prime. Let $q$ be a prime factor
of $n$ and note that

$$x = \bigl(aabb(ab)^{q-2}\bigr) \cdot \bigl((ab)^q\bigr)^{n/q - 1}$$

and that all factors of length $2q$ have $q$ occurrences of $a$ and $q$ occurrences of $b$. Further, $aabb(ab)^{q-2}$ is an A-primitive
root by Proposition 1. As $n/q \ge 2$, the word $x$ is an Abelian power, contradicting $x \in AQ$. □
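Propositions 1 and 2 together say that $aabb(ab)^{n-2}$ is A-primitive exactly when $n$ is prime. A small brute-force check of this statement, for illustration only, could look as follows; the helper names and the search bound are arbitrary choices, not part of the paper.

```python
from collections import Counter

def is_a_primitive(w):
    """Brute force: w has no proper A-root length d dividing |w|."""
    n = len(w)
    for d in range(1, n):
        if n % d == 0:
            first = Counter(w[:d])
            if all(Counter(w[i:i + d]) == first for i in range(d, n, d)):
                return False
    return True

def is_prime(m):
    return m >= 2 and all(m % q for q in range(2, int(m ** 0.5) + 1))

def counterexamples(limit):
    """Values 2 <= n <= limit where A-primitivity of aabb(ab)^(n-2) and primality of n disagree."""
    return [n for n in range(2, limit + 1)
            if is_a_primitive("aabb" + "ab" * (n - 2)) != is_prime(n)]

print(counterexamples(60))  # []
```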

We can now show that the set of all A-primitive words is not context-free.
Theorem 1. The set AQ is not context-free.

Proof. We first prove that $M$ is not context-free. Let $M' = h^{-1}(\{aabb\} \backslash M)$ where $h : \{a\}^* \to \{a,b\}^*$ is the morphism
$h(a) = ab$. Then $M' = \{a^{p-2} : p \text{ is prime}\}$. As the context-free languages are closed under quotient by regular sets
and inverse homomorphism, $M'$ is context-free if $M$ is. But as $M'$ is unary, if it is a context-free language then it is also
regular, and by the pumping lemma we can see that $M'$ is not regular. Thus $M$ is not context-free. Since
$M = AQ \cap aabb(ab)^*$ and the context-free languages are closed under intersection with regular languages, AQ is not
context-free either. □

The set of all non-trivial Abelian powers, $\overline{AQ}$, is also non-context-free, as can be seen through, e.g., the intersection

$$\overline{AQ} \cap a^*ba^*ba^*b = \{a^n b a^n b a^n b : n \ge 0\}.$$

For discussion on the complexity of the language of marked Abelian squares and its relation to iterated shuffle and
deletion operations, see Domaratzki [9, Sect. 8.4.1] and Jędrzejowicz and Szepietowski [13, Ex. 3.2]. Using the
interchange lemma, Gabarró [11] has proven that the language $\{u w_1 w_2 v : u, w_1, w_2, v \in \Sigma^*, \Psi(w_1) = \Psi(w_2)\}$ of
words containing an Abelian square is not context-free.

4 Complexity of AQ 

Through an elegant pattern matching algorithm [15, Thm. 13], it is known that we can determine whether a word is
primitive in linear time. We now consider this problem for A-primitive words. Throughout this section, we consider
the size of the alphabet to be a fixed constant. In order to illustrate the basic principles of the algorithm, we begin with
an $O(n \log n / \log\log n)$ algorithm:

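    from collections import Counter

    def isAbelPrim(w):
        n = len(w)
        if n == 1:
            return True
        # PF = { p : p is prime, p | n }, by trial division
        PF, m, p = set(), n, 2
        while p * p <= m:
            if m % p == 0:
                PF.add(p)
            while m % p == 0:
                m //= p
            p += 1
        if m > 1:
            PF.add(m)
        D = {n // p for p in PF}
        for d in D:
            # w has an A-root of length d iff all length-d blocks share a Parikh vector
            if all(Counter(w[i:i + d]) == Counter(w[:d]) for i in range(d, n, d)):
                return False
        return True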
Suppose that $w \in AQ$. Then $w$ certainly does not have an A-root whose length is any of the values in $D$, thus
isAbelPrim returns true. On the other hand, if $w \notin AQ$ with $|w| > 1$, then $w$ has an A-root of length $r$ for some $r \mid n$
with $r < n$. There exists $d_r \in D$ such that $r \mid d_r$ ($r$ may also divide other $d \in D$, but it is enough to know it divides some
$d_r$). By Observation 1, on the iteration of the loop of isAbelPrim with $d = d_r$, the algorithm will return false.

One iteration of the loop in isAbelPrim will take time $O(n)$, by walking across $w$ and computing the Parikh
vectors for each block of length $d \in D$. Thus, the runtime of the algorithm is $O(p(n) + n\,\omega(n))$ where $p(n)$ is the time
required to calculate the set PF.

We claim that even using trial division (rather than more complex methods such as, e.g., the general number field sieve
[6]), we have $p(n) \in O(\sqrt{n} \log n)$. Consider the following algorithm:

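    import math

    def PF(n):
        # Distinct prime factors of n, found by repeated trial division.
        pf = []
        while True:
            p = 2
            found = False
            # find the least prime p dividing the current value of n
            while p <= math.isqrt(n) and not found:
                if n % p == 0:
                    found = True
                    pf.append(p)
                    # factor out the largest power of p dividing n
                    while n % p == 0:
                        n //= p
                p += 1
            if not found:
                break
        # whatever remains is 1 or a single prime
        if n != 1:
            pf.append(n)
        return pf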

The method PF calculates the prime factors of $n$ by repeatedly finding the least prime $p$ dividing $n$ and factoring
out the largest power $p^{\alpha}$ which divides $n$. Then this process is repeated on $n/p^{\alpha}$.

As for the running time of PF, let $n = \prod_{i=1}^{k} p_i^{\alpha_i}$ be the prime factorization of $n$. The outer while loop executes
$\omega(n) = k$ times, once for each $p_i$ dividing $n$, while one execution of the inner two while loops takes $O(\sqrt{n} + \alpha_i)$ time.
Thus, the total run-time is $O(\sum_{p_i \mid n} (\sqrt{n} + \alpha_i)) = O(\sqrt{n}\,\omega(n) + \omega'(n))$. As $\omega(n) \in O(\log n / \log\log n)$ [3, Thm. 8.8.10]
and $\omega'(n) \in O(\log n)$ [12, Sect. 22.10], this gives the claimed worst case running time for PF.

Thus, the running time of isAbelPrim is $O(n\,\omega(n))$. Using the same estimate on the worst-case growth of $\omega(n)$,
we obtain the following result:

Theorem 2. Given a word $x$ of length $n$, there is an algorithm to determine if $x \in AQ$ which runs in
$O(n \log n / \log\log n)$ time in the worst case.

For space complexity, we briefly note that the set AQ is in DSPACE(log(n)). To see this, if we are testing whether
a word is of the form $u_1 u_2 \cdots u_n$ where $\Psi(u_i) = \Psi(u_j)$ for all $i, j$, we can use log-sized counters to keep track of the
current prefix length, the block number $j$ ($2 \le j \le n$) and the values of the Parikh vectors for $u_1$ and $u_j$. Viewing the
alphabet size as constant, this is a constant number of counters.
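The counter argument can be illustrated with a routine that reads the word only through index lookups and stores nothing but a few integers. This is a sketch of the idea rather than the authors' construction: the function name and the letter-by-letter comparison order are assumptions, and the code trades time for space.

```python
def has_a_root_with_counters(w, d, alphabet):
    """Check that all length-d blocks of w have equal Parikh vectors,
    using only integer counters and read-only access to w."""
    n = len(w)
    if d <= 0 or n % d != 0:
        return False
    for j in range(1, n // d):        # block number being compared with the first block
        for a in alphabet:            # one letter at a time
            c_first = c_j = 0
            for i in range(d):        # position inside the block
                if w[i] == a:
                    c_first += 1
                if w[j * d + i] == a:
                    c_j += 1
            if c_first != c_j:
                return False
    return True
```

Each counter is bounded by the length of the word, so it fits in a logarithmic number of bits.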

4.1 A linear time algorithm for recognizing AQ 

We can improve the algorithm isAbelPrim from the previous section by caching commonly used Parikh vectors,
and obtain a linear time algorithm. Let gpf(n) be the greatest prime factor of $n$. Then we note that if $\mathrm{gpf}(n)^2 \mid n$, every
$d \in D$ is divisible by gpf(n), while if $\mathrm{gpf}(n)^2 \nmid n$, then every $d \in D$ is divisible by gpf(n) except $d = n/\mathrm{gpf}(n)$. In both
cases, we will precompute the Parikh vectors of length gpf(n) in order to compute the Parikh vectors of length $d$ for
all $d \in D$ which are divisible by gpf(n).
Let $w$ be our input word of length $n$ and write $w = w_1 w_2 \cdots w_{n/\mathrm{gpf}(n)}$ where each block has length gpf(n). Let
$u_i = \Psi(w_i)$ for $1 \le i \le n/\mathrm{gpf}(n)$. Note then that if $\mathrm{gpf}(n) \mid d$, then the blocks of $w$ of length $d$ have Parikh vectors of
the form

$$\sum_{j=1}^{d/\mathrm{gpf}(n)} u_{k d/\mathrm{gpf}(n) + j}$$

for some $0 \le k < n/d$. Thus, we can compute these Parikh vectors quickly by summing the precomputed $u_i$.

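    from collections import Counter

    def Parikh(w, i, j):
        # Parikh vector of the i-th block (0-indexed) of length j in w
        return Counter(w[i * j:(i + 1) * j])

    def prime_factors(n):
        # PF = { p : p is prime, p | n }, by trial division
        pf, p = set(), 2
        while p * p <= n:
            if n % p == 0:
                pf.add(p)
            while n % p == 0:
                n //= p
            p += 1
        if n > 1:
            pf.add(n)
        return pf

    def isAbelPrimLin(w):
        n = len(w)
        if n <= 1:
            return True
        PF = prime_factors(n)
        gpf = max(PF)
        D = {n // p for p in PF}
        if n % (gpf ** 2) != 0:
            # n/gpf is the only d in D not divisible by gpf: test it directly
            D.remove(n // gpf)
            d = n // gpf
            if all(Parikh(w, i, d) == Parikh(w, 0, d) for i in range(1, gpf)):
                return False
        # cache the Parikh vectors of the blocks of length gpf
        u = [Parikh(w, i, gpf) for i in range(n // gpf)]
        for d in D:
            # Parikh vectors of the blocks of length d, summed from the cached u[i]
            s = d // gpf
            v = [sum(u[k:k + s], Counter()) for k in range(0, n // gpf, s)]
            if all(x == v[0] for x in v):
                return False
        return True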

Here, we let Parikh(w, i, j) be a method which computes the i-th Parikh vector of length j in the word w.

This modified implementation has the same correctness as the previous implementation, as the same tests are
performed. We now show the claimed $O(n)$ run time. Computing D and PF is the same as in isAbelPrim and can
be done in linear time. Having computed PF, the calculation of gpf(n) takes $O(\omega(n)) = O(\log n / \log\log n)$ time. In
the case where $\mathrm{gpf}(n)^2 \nmid n$, the time to execute the additional statements is $O(n)$. Similarly, the computation of
the Parikh vectors u[i] takes time $O(n)$.

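Consider the execution of the final for loop. For $d \in D$, we need $O(d/\mathrm{gpf}(n))$ time to compute one Parikh vector
of a subword of $w$ of length $d$, so to compute all $n/d$ such vectors requires time $O(n/\mathrm{gpf}(n))$. To test the equalities of
all these $n/d$ vectors (implied by the if statement) requires time $O(n/d) = O(p)$ where $d = n/p$. Thus, the worst case
running time of the loop is

$$O\Bigl(\sum_{p \mid n} \Bigl(\frac{n}{\mathrm{gpf}(n)} + p\Bigr)\Bigr) = O\Bigl(\frac{n\,\omega(n)}{\mathrm{gpf}(n)} + \sum_{p \mid n} p\Bigr),$$

where the sums range over the primes $p$ dividing $n$. We now estimate the first quantity.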

Proposition 3. For all integers $n \ge 2$, $\omega(n)/\mathrm{gpf}(n) \le 2/3$.

Proof. Note that if $\omega(n) = r$ for some integer $r$, then $\mathrm{gpf}(n) \ge p_r$ (where $p_r$ is the $r$-th prime), since if $n$ has $r$ prime
factors, the minimum possible value for its largest prime factor is $p_r$. A simple induction proves that $p_r \ge 2r - 1$ for
$r \ge 2$. Thus, $\omega(n)/\mathrm{gpf}(n)$ is bounded above by $x/(2x-1)$ with $x = \omega(n)$ for all $n$ with at least two prime factors. But
$x/(2x-1)$ is maximized at $x = 2$ on the interval $x \ge 2$. Thus, $\omega(n)/\mathrm{gpf}(n) \le 2/3$ for all $n$ with at least two prime
factors. For prime powers, $\omega(n)/\mathrm{gpf}(n) = 1/\mathrm{gpf}(n) \le 1/2 < 2/3$. □
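The bound of Proposition 3, and the companion fact that the distinct primes dividing $n$ sum to at most $n$, are easy to confirm numerically on a finite range. The following check is illustrative only; the helper names and the bound 1000 are arbitrary.

```python
from fractions import Fraction

def prime_divisors(n):
    """Distinct prime divisors of n >= 2, by trial division."""
    ps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            ps.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        ps.append(n)
    return ps

def max_ratio(limit):
    """Largest value of omega(n)/gpf(n) over 2 <= n <= limit."""
    return max(Fraction(len(ps), max(ps))
               for ps in map(prime_divisors, range(2, limit + 1)))

def prime_sum_bounded(limit):
    """True if the distinct primes dividing n sum to at most n for all 2 <= n <= limit."""
    return all(sum(prime_divisors(n)) <= n for n in range(2, limit + 1))

print(max_ratio(1000))  # 2/3, attained e.g. at n = 6
```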

Finally, we have that $\sum_{p \mid n} p \le n$. Thus, the total running time of the loop is $O(n)$.

Theorem 3. Given a word $x$ of length $n$, there is an algorithm to determine if $x \in AQ$ which runs in $O(n)$ time in the worst case.

5 Words with multiple A-primitive roots



We show that unlike classical primitive words, a word may have multiple distinct A-primitive roots. This fact was
essentially noted by Constantinescu and Ilie [5], who constructed an infinite word $w$ with two distinct Abelian periods.
We generalize this to show that for all $n \ge 2$, we can construct a word with $n$ distinct A-primitive roots.

For all $n \ge 1$, let $Q_n = 2 \cdot \prod_{i=1}^{n} p_i$, where $p_i$ is the $i$-th prime for $i \ge 1$, with $p_1 = 2$. Then for all $n \ge 1$, let $w_n$ be
the word defined by

$$w_n = aabb(ab)^{(Q_n - 4)/2}.$$

Note that $|w_n| = Q_n$. For example,

$$w_2 = aabbabababab.$$

Lemma 1. For all $n \ge 2$, the word $w_n$ has $n$ distinct A-primitive roots. In particular, the words

$$r_m = aabb(ab)^{p_m - 2}$$

for all $n \ge m \ge 1$ are A-primitive roots of $w_n$.

Proof. First, note that $ab$ is not an A-primitive root of $w_n$, as it is not a prefix of $w_n$.

Let $1 \le m \le n$. Then the first subword of $w_n$ of length $2p_m$ is $r_m$. All subsequent subwords of $w_n$ of length $2p_m$ are
$(ab)^{p_m}$. All subwords have Parikh vector $(p_m, p_m)$, and each $r_m$ is A-primitive by Proposition 1.

Finally, note that the lengths of the $r_m$ form a division-free set $\{2p_m : 1 \le m \le n\}$. Thus, all $r_m$ are distinct
A-primitive roots of $w_n$. □
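The construction of Lemma 1 can be reproduced for small $n$ and the root lengths listed by brute force. This is a sketch for illustration: the hard-coded prime list, the function names, and the restriction to roots shorter than the word are all assumptions made here.

```python
from collections import Counter

PRIMES = [2, 3, 5, 7, 11]  # first few primes; enough for small n

def w(n):
    """The word w_n = aabb(ab)^((Q_n - 4)/2) with Q_n = 2 * p_1 * ... * p_n."""
    Q = 2
    for p in PRIMES[:n]:
        Q *= p
    return "aabb" + "ab" * ((Q - 4) // 2)

def has_a_root(x, d):
    first = Counter(x[:d])
    return all(Counter(x[i:i + d]) == first for i in range(d, len(x), d))

def is_a_primitive(x):
    return not any(has_a_root(x, d) for d in range(1, len(x)) if len(x) % d == 0)

def a_primitive_root_lengths(x):
    """Lengths d < |x| such that the prefix of length d is an A-primitive A-root of x."""
    return [d for d in range(1, len(x))
            if len(x) % d == 0 and has_a_root(x, d) and is_a_primitive(x[:d])]

print(w(2))                            # aabbabababab
print(a_primitive_root_lengths(w(2)))  # [4, 6]
print(a_primitive_root_lengths(w(3)))  # [4, 6, 10]
```

The lengths found are exactly $2p_m$ for $1 \le m \le n$, matching the roots $r_m$ of the lemma.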

The following lemma shows that a word cannot have A-primitive roots whose lengths are coprime.

Lemma 2. If $w$ has two distinct A-primitive roots $u, v$ where $|u| = \ell_1$, $|v| = \ell_2$, then $\gcd(\ell_1, \ell_2) \ge 2$.

Proof. Assume that $w$ has two distinct A-primitive roots as above: $w = u_1 u_2 \cdots u_m$ and $w = v_1 v_2 \cdots v_n$ where $|u_i| = \ell_1$,
$|v_j| = \ell_2$. Assume, contrary to what we want to prove, that $\gcd(\ell_1, \ell_2) = 1$.

First note that $m \ge \ell_2$. To see this, note that $|w| = m\ell_1 = n\ell_2$ and we have that $\ell_2 \mid \ell_1 m$. If $m < \ell_2$, and as $\ell_1$ and $\ell_2$
are coprime, $\ell_2 \mid \ell_1 m$ is a contradiction.

Thus $m \ge \ell_2$ and $n \ge \ell_1$ as well. As $\gcd(\ell_1, \ell_2) = 1$, there exist $r, s \ge 0$ such that $r\ell_1 = s\ell_2 - 1$ (or $r\ell_1 = s\ell_2 + 1$,
which is proven similarly). As $m \ge \ell_2$ and $n \ge \ell_1$, we can assume that $s \le n$ and $r \le m$.

Thus, the prefix $v' = v_1 v_2 \cdots v_s$ of $w$ of length $s\ell_2$ is one letter longer than the prefix $u' = u_1 u_2 \cdots u_r$. Without loss of
generality, let $a$ be the last symbol of $v_s$, which is also the first symbol of $u_{r+1}$. Let $\alpha = |u_1|_a$ and $\beta = |v_1|_a$. Counting
the occurrences of $a$ in $u'$ and $v'$, we get

$$r\alpha = s\beta - 1. \qquad (1)$$

Now consider that the prefix of $w$ of length $\ell_1 \ell_2$ is $u_1 \cdots u_{\ell_2} = v_1 \cdots v_{\ell_1}$. Considering $v'' = v_{s+1} \cdots v_{\ell_1}$ and
$u'' = u_{r+1} \cdots u_{\ell_2}$, and again counting the occurrences of $a$, we also have

$$(\ell_2 - r)\alpha = (\ell_1 - s)\beta + 1. \qquad (2)$$

Equating both (1) and (2) in terms of $\alpha$, we get

$$r\bigl((\ell_1 - s)\beta + 1\bigr) = (\ell_2 - r)(s\beta - 1).$$

Solving for $\beta$ gives $\beta = \ell_2$. Thus, we have that $v_1 \in a^+$ and thus $w$ only has A-primitive root $a$, a contradiction. Thus,
$\gcd(\ell_1, \ell_2) \ge 2$. □
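The elimination step at the end of the proof can be written out in full. The lines below use only equations (1) and (2), eliminated for $\alpha$, together with the relation $s\ell_2 - r\ell_1 = 1$ assumed in the proof; they add no new hypotheses.

```latex
\begin{align*}
r\bigl((\ell_1 - s)\beta + 1\bigr) &= (\ell_2 - r)(s\beta - 1) \\
r\ell_1\beta - rs\beta + r &= s\ell_2\beta - rs\beta - \ell_2 + r \\
\ell_2 &= (s\ell_2 - r\ell_1)\,\beta = \beta .
\end{align*}
```

Since $|v_1| = \ell_2$ and $\beta = |v_1|_a$, every letter of $v_1$ is $a$, which is the statement $v_1 \in a^+$ used in the proof.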



6 Number of Abelian Primitive Roots 

We now turn to the number of A-primitive roots a word may have, as a function of its length. As shown in the previous 
section, for any n, we can construct a word with n A-primitive roots. In this section, we improve this to give a tight 
bound on the number of A-primitive roots a word may have. 
6.1 Upper Bound

We first give an upper bound on the number of A-primitive roots a word may have. We need the following estimate on $d(n)$.
Theorem 4. The function $d(n)$ satisfies $d(n) \in o(2^{\log n / \log\log n})$.

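We will also use a result by de Bruijn et al. [8] (see also Anderson):
Theorem 5. Let $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$ be the prime factorization of $n \ge 2$. Let $D(n)$ be the set of integers defined by

$$D(n) = \Bigl\{ p_1^{\beta_1} p_2^{\beta_2} \cdots p_k^{\beta_k} : \forall i\,(\beta_i \le \alpha_i) \text{ and } \sum_{i=1}^{k} \beta_i = \lfloor \omega'(n)/2 \rfloor \Bigr\}.$$

Then $D(n)$ is a maximal anti-chain in the divisor lattice of $n$.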

In other words, $D(n)$ is a largest division-free set of divisors of $n$. Anderson [2] gives the following estimate on
the size of $D(n)$, which we denote $s(n)$:

Theorem 6. Let $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$ be the prime factorization of $n \ge 2$. Let $A(n) = \frac{1}{3} \sum_{i=1}^{k} \alpha_i(\alpha_i + 2)$. Then the
maximal anti-chain in the divisor lattice of $n$ has size $s(n) = \Theta(d(n) / \sqrt{A(n)})$.
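For small $n$ the set $D(n)$ of Theorem 5 can be listed directly, since it consists of the divisors of $n$ having exactly $\lfloor \omega'(n)/2 \rfloor$ prime factors counted with multiplicity. The sketch below does this by brute force; the function names are illustrative and no attempt is made at efficiency.

```python
def big_omega(m):
    """Number of prime factors of m counted with multiplicity (omega' in the text)."""
    count, p = 0, 2
    while p * p <= m:
        while m % p == 0:
            m //= p
            count += 1
        p += 1
    return count + (1 if m > 1 else 0)

def D(n):
    """Divisors of n with exactly floor(omega'(n)/2) prime factors, in increasing order."""
    half = big_omega(n) // 2
    return [d for d in range(1, n + 1) if n % d == 0 and big_omega(d) == half]

def is_division_free(S):
    """True if no element of S divides a different element of S."""
    return all(x == y or x % y != 0 for x in S for y in S)

print(D(30))  # [2, 3, 5]
print(D(36))  # [4, 6, 9]
```

Here `len(D(n))` plays the role of $s(n)$.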

Now a word $w$ of length $n$ has at most $|D(n)|$ A-primitive roots: if $r$ is an A-primitive root, then $|r|$ divides $|w|$ and
$|r|$ is not divisible by the length of any other A-primitive root. Thus, we can obtain the following result:

Theorem 7. If $w$ is a word of length $n$, the number of distinct A-primitive roots of $w$ is at most $s(n) \in o(2^{\log n / \log\log n})$.

Proof. By Theorem 5, if $w$ is a word of length $n$, then $w$ has at most $s(n)$ distinct A-primitive roots. By Theorem 4
and Theorem 6, $s(n) \in o(2^{\log n / \log\log n})$. Thus, the result follows. □

We can use a result of Anderson [1] which gives the average order of s(n): 

Theorem 8. As $\omega'(n) \to \infty$, we have

. , d(n) 

s{n) < ' ■■' " ' 

As $n \to \infty$,

^ d(m) 

ht n ^0)'{m) V21oglogn' 

6.2 Lower Bound 

For a lower bound on the number of A-primitive roots a word may have, we give an explicit construction. For any
$n \ge 2$, let $T(n) = \{kd : k \in \mathbb{N}, d \in D(n), kd \le n\}$. Let $t_1 < t_2 < \cdots < t_{m_n} = n$ be the $m_n$ elements of $T(n)$ in sorted
order. Define

$$z_n = a^{t_1} b^{t_1} \prod_{i=2}^{m_n} a^{t_i - t_{i-1}} b^{t_i - t_{i-1}}.$$

Note that $z_n$ is a word of length $2n$ with $\Psi(z_n) = (n,n)$.
Example 2. If $n = 30$, then $D(30) = \{2,3,5\}$. In this case

$$T(30) = \{2,3,4,5,6,8,9,10,12,14,15,16,18,20,21,22,24,25,26,27,28,30\}.$$

With this, we have

$$z_{30} = aabbababababaabbababaabbaabbababaabbaabbababaabbababababaabb.$$
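The word of Example 2 can be regenerated mechanically from the definitions of $T(n)$ and $z_n$. In this sketch the anti-chain $D(n)$ is passed in explicitly (here $\{2,3,5\}$, as stated in the example); the function names `T` and `z` and that calling convention are assumptions made for illustration.

```python
def T(n, D):
    """Sorted multiples kd <= n (k >= 1) of the elements d of D."""
    return sorted({k * d for d in D for k in range(1, n // d + 1)})

def z(n, D):
    """The word a^{t_1} b^{t_1} a^{t_2 - t_1} b^{t_2 - t_1} ... built from T(n, D)."""
    parts, prev = [], 0
    for t in T(n, D):
        gap = t - prev
        parts.append("a" * gap + "b" * gap)
        prev = t
    return "".join(parts)

word = z(30, {2, 3, 5})
print(len(word))  # 60
print(word)
```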
Lemma 3. Let $n \ge 2$ and $t \in D(n)$. Then $z_n$ has an A-primitive root of length $2t$.
Proof. Let $1 \le j \le m_n$ be the index such that $t = t_j$. As $t \in D(n) \subseteq T(n)$, we have that the prefix of $z_n$ of length $2t$ is

$$w_n = a^{t_1} b^{t_1} a^{t_2 - t_1} b^{t_2 - t_1} \cdots a^{t_j - t_{j-1}} b^{t_j - t_{j-1}}.$$

Note that $\Psi(w_n) = (t,t)$. Now, each additional block of length $2t$ from $z_n$ has the form

$$a^{\alpha} b^{\alpha} \cdots a^{\beta} b^{\beta}$$

for some $\alpha, \beta$ which are differences of successive $t_i$. To see this, note that these factors of $z_n$ begin and end at positions
which are multiples of $t \in D(n)$, so each of the breakpoints are elements of $T(n)$. By telescoping, each of these factors
has Parikh vector $(t,t)$. Thus, $z_n$ is an $n/t$-th A-power.

Further, $w_n$ must be an A-primitive root of $z_n$. Otherwise, there is some $z \in T(n)$ such that $z \mid t$, but in this case, $z$
is divisible by some element in $D(n)$, by definition of $T(n)$. But this gives a contradiction, since $t \in D(n)$ and $D(n)$ is
an anti-chain of divisors. □

Corollary 1. For all $n \ge 2$, there exists a word of length $2n$ with $s(n)$ distinct A-primitive roots.

7 Counting Abelian Primitive Words 

Let $\psi_k(n)$ be the number of primitive words of length $n$ over a $k$-letter alphabet, let $\psi'_k(n)$ be the number of A-primitive
words of length $n$ over a $k$-letter alphabet, and let $\Delta_k(n) = \psi_k(n) - \psi'_k(n)$. Note that $\Delta_k(n) \ge 0$ for all $n$, but we can
observe, e.g., that $\Delta_k(p) = 0$ for all primes $p$. Small values of $\psi'_k(n)$ are given in Figure 1.
Figure 1: Number of A-primitive words $\psi'_k(n)$ by length ($n$) and alphabet size ($k$).

The function $\psi_k(n)$ is well-known (see, e.g., Lothaire [14]). The formula

$$\psi_k(n) = \sum_{d \mid n} \mu(d)\, k^{n/d}$$

expresses $\psi_k$ in terms of the Möbius function $\mu$ defined by $\mu(1) = 1$, $\mu(n) = (-1)^k$ if $n$ is a product of $k$ distinct
primes and $\mu(n) = 0$ if $p^2 \mid n$ for some prime $p$.
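For small lengths, both counts can be computed directly: the number of primitive words from the Möbius formula, and the number of A-primitive words by exhaustive enumeration. The sketch below is illustrative; the function names are assumptions, and the enumeration is exponential in $n$, so it is only usable for small parameters.

```python
from collections import Counter
from itertools import product

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def psi(k, n):
    """Number of primitive words of length n over k letters (Moebius formula)."""
    return sum(mobius(d) * k ** (n // d) for d in range(1, n + 1) if n % d == 0)

def is_a_primitive(w):
    n = len(w)
    for d in range(1, n):
        if n % d == 0:
            first = Counter(w[:d])
            if all(Counter(w[i:i + d]) == first for i in range(d, n, d)):
                return False
    return True

def count_a_primitive(k, n):
    """Number of A-primitive words of length n over k letters, by enumeration."""
    return sum(is_a_primitive(w) for w in product(range(k), repeat=n))

print(psi(2, 4), count_a_primitive(2, 4))  # 12 10
```

For length 4 over two letters the difference is 2, accounted for by the primitive words abba and baab, which are Abelian squares.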

We can characterize $\Delta_k$ for prime powers exactly:

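Lemma 4. For all primes $p$ and all $r \ge 2$,

$$\Delta_k(p^r) = \sum \binom{p^{r-1}}{n_1, n_2, \ldots, n_k} \left( \binom{p^{r-1}}{n_1, n_2, \ldots, n_k}^{p-1} - 1 \right).$$

Here, the sum is taken over all partitions $n_1 + n_2 + \cdots + n_k$ of $p^{r-1}$ (ordered $k$-tuples of non-negative integers).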

Proof. Let $x \in Q - AQ$ be a word of length $p^r$. As $x$ is not A-primitive, it has an A-primitive root of length $p^i$ for some $1 \le i < r$.
But then $x$ can also be written as $x = x_1 x_2 \cdots x_p$ where $|x_i| = p^{r-1}$ and $\Psi(x_i) = \Psi(x_j)$ for all $1 \le i,j \le p$. Thus, it
suffices to count only those $x$ of this form.

Consider that there are $\binom{p^{r-1}}{n_1, n_2, \ldots, n_k}$ different words $x_1$ of length $p^{r-1}$ such that $\Psi(x_1) = (n_1, n_2, \ldots, n_k)$ for each
partition $n_1 + n_2 + \cdots + n_k = p^{r-1}$. As recently noted by Richmond and Shallit [16], for a fixed choice of $x_1$, the remainder
of the words $x_2, \ldots, x_p$ must satisfy $\Psi(x_j) = \Psi(x_1)$, which can be done in $\binom{p^{r-1}}{n_1, n_2, \ldots, n_k}$ ways for each $2 \le j \le p$. Thus,
we get a total of $\binom{p^{r-1}}{n_1, n_2, \ldots, n_k}^{p-1}$ possibilities, and we must exclude the choice $x_1 = x_2 = x_3 = \cdots = x_p$, as this word is
not primitive.

Thus, multiplying the number of choices of the word $x_1$ and the words $x_2, \ldots, x_p$ and summing over all possible
Parikh vectors, we get the result. □
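The count described in the proof can be compared against exhaustive enumeration for a few small prime powers. In the sketch below, `delta_formula` sums $M(M^{p-1} - 1)$ over all ordered $k$-tuples summing to $p^{r-1}$, where $M$ is the corresponding multinomial coefficient; this reading of the formula is reconstructed from the proof, and all function names are assumptions.

```python
from collections import Counter
from itertools import product
from math import factorial

def compositions(total, k):
    """Ordered k-tuples of non-negative integers summing to total."""
    if k == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, k - 1):
            yield (first,) + rest

def multinomial(parts):
    out = factorial(sum(parts))
    for m in parts:
        out //= factorial(m)
    return out

def delta_formula(k, p, r):
    """Sum of M * (M^(p-1) - 1) over Parikh vectors of words of length p^(r-1)."""
    return sum(M * (M ** (p - 1) - 1)
               for M in map(multinomial, compositions(p ** (r - 1), k)))

def is_primitive(w):
    return w not in (w + w)[1:-1]

def is_a_primitive(w):
    n = len(w)
    return not any(n % d == 0 and all(Counter(w[i:i + d]) == Counter(w[:d])
                                      for i in range(d, n, d))
                   for d in range(1, n))

def delta_brute(k, n):
    """Number of primitive words of length n over k letters that are not A-primitive."""
    words = ("".join(t) for t in product("abcdefghij"[:k], repeat=n))
    return sum(is_primitive(w) and not is_a_primitive(w) for w in words)

print(delta_formula(2, 2, 3), delta_brute(2, 8))  # 54 54
```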

The problem of giving a closed form of $\Delta_k(n)$ or $\psi'_k(n)$ for all values of $n$ is still open.

8 Equivalence Relations on A-primitive words 

In this section, we consider classical results such as the Lyndon-Schützenberger Theorem for classical words in the
context of Abelian primitivity. To do so, we define an appropriate equivalence relation to replace equality.

We first note that the A-primitive words are not closed under conjugation. For example, note that $bbababaa \in AQ$
but $aabbabab \notin AQ$. Because of this, the concept of a Lyndon-type word for A-primitive words does not have a
straightforward definition (recall that a primitive word $w$ is a Lyndon word if it is the lexicographically least word in its class
of conjugates).

For all $n \ge 1$, let $\sim_n$ be the binary relation defined on words by $u \sim_n x$ if we can write $u = \alpha_1 \alpha_2 \cdots \alpha_m$ and
$x = \beta_1 \beta_2 \cdots \beta_m$ where

(a) for all $1 \le i \le m$, $|\alpha_i| = |\beta_i| = n$.

(b) for all $1 \le i,j \le m$, $\Psi(\alpha_i) = \Psi(\beta_j)$.

Thus, $\sim_n$ represents that two words can be broken into blocks of length $n$, all of which have the same image under $\Psi$.
Example 3. Let $n = 3$. Then $abcacbabc \sim_3 cbabcabca$ as each block $b$ of length three in both words satisfies
$\Psi(b) = (1,1,1)$.
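The relation is straightforward to test: cut both words into blocks of length $n$ and compare all Parikh vectors. The following sketch checks Example 3; the names `blocks` and `sim` are illustrative.

```python
from collections import Counter

def blocks(w, n):
    """Consecutive blocks of length n of w (assumes n divides |w|)."""
    return [w[i:i + n] for i in range(0, len(w), n)]

def sim(u, x, n):
    """True if u and x split into the same number of length-n blocks,
    all of which have the same Parikh vector."""
    if n <= 0 or len(u) != len(x) or len(u) % n != 0:
        return False
    vecs = [Counter(b) for b in blocks(u, n) + blocks(x, n)]
    return all(v == vecs[0] for v in vecs)

print(sim("abcacbabc", "cbabcabca", 3))  # True
```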

We use $\sim_n$ to investigate relationships with the theory of codes in the context of commutation.
Theorem 9. For all words $u, x \in \Sigma^*$, $ux \sim_n xu$ if and only if there exist $r \ge 1$ and $\alpha_1, \ldots, \alpha_r, \beta_1, \ldots, \beta_r \in \Sigma^*$ such that

(a) for all $1 \le i \le r$, $|\alpha_i \beta_i| = n$.

(b) for all $1 \le i,j \le r$, $\Psi(\alpha_i) = \Psi(\alpha_j)$ and $\Psi(\beta_i) = \Psi(\beta_j)$.

(c) there exists $1 \le s \le r$ such that $u = \alpha_1 \beta_1 \cdots \alpha_{s-1} \beta_{s-1} \alpha_s$ and $x = \beta_s \alpha_{s+1} \beta_{s+1} \cdots \alpha_r \beta_r$.
Proof. ($\Leftarrow$) Let $u, x$ satisfy the conditions. Then we have that

$$ux = \alpha_1 \beta_1 \cdots \alpha_r \beta_r$$

$$xu = (\beta_s \alpha_{s+1})(\beta_{s+1} \alpha_{s+2}) \cdots (\beta_{r-1} \alpha_r)(\beta_r \alpha_1)(\beta_1 \alpha_2) \cdots (\beta_{s-2} \alpha_{s-1})(\beta_{s-1} \alpha_s)$$

Thus, note that with the parenthesization above, each subword in $ux$ of length $n$ has the form $\alpha_i \beta_i$,
while in $xu$, they have the form $\beta_i \alpha_{i+1 \pmod r}$.
Now note that for any value of $i$ and $j$, we have

$$\Psi(\alpha_i \beta_i) = \Psi(\alpha_i) + \Psi(\beta_i) = \Psi(\beta_j) + \Psi(\alpha_{j+1 \pmod r}) = \Psi(\beta_j \alpha_{j+1 \pmod r}).$$

Thus, $ux \sim_n xu$.

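($\Rightarrow$) Let $ux \sim_n xu$. Then we can write $ux = \gamma_1 \gamma_2 \cdots \gamma_t$ and $xu = \eta_1 \eta_2 \cdots \eta_t$ where for all $1 \le i,j \le t$, we have

$$\Psi(\gamma_i) = \Psi(\eta_j) \quad\text{and}\quad |\gamma_i| = |\eta_j| = n.$$

Assume without loss of generality that $|u| \ge |x|$. Let $1 \le p \le t$ be such that

$$u = \gamma_1 \gamma_2 \cdots \gamma_p \gamma'_{p+1}$$
$$x = \gamma''_{p+1} \gamma_{p+2} \cdots \gamma_t$$

where $\gamma'_{p+1} \gamma''_{p+1} = \gamma_{p+1}$. Similarly, we can write

$$x = \eta_1 \eta_2 \cdots \eta_{t-p-1} \eta'_{t-p}$$

where $\eta'_{t-p} \eta''_{t-p} = \eta_{t-p}$.

Thus, we have that $\gamma_1 \gamma_2 \cdots \gamma_p \gamma'_{p+1} = \eta''_{t-p} \eta_{t-p+1} \cdots \eta_t$. Write $\eta_t = \eta'_t \eta''_t$ where $|\eta''_t| = |\gamma'_{p+1}|$. Similarly, write
$\gamma_p = \gamma'_p \gamma''_p$ where $|\gamma''_p| = |\eta'_t|$. Then we have that $\eta_t = \gamma''_p \gamma'_{p+1}$ and so certainly their images under $\Psi$ are the same.
As all blocks of $ux$ and $xu$ have the same image, we can therefore conclude that $\Psi(\gamma_{p+1}) = \Psi(\gamma''_p \gamma'_{p+1})$. But clearly
$\Psi(\gamma_{p+1}) = \Psi(\gamma'_{p+1}) + \Psi(\gamma''_{p+1})$. Therefore, we get that $\Psi(\gamma''_{p+1}) = \Psi(\gamma''_p)$. Finally, $\Psi(\gamma_p) = \Psi(\gamma_{p+1})$
gives that $\Psi(\gamma'_p) = \Psi(\gamma'_{p+1})$. Continuing in this way, we can factorize each $\gamma_i$ into $\gamma'_i$ and $\gamma''_i$ so that all $\gamma'_i$ have the same
image under $\Psi$, and separately, all the $\gamma''_i$ have the same image under $\Psi$.

Thus, let $r = t$, $s = p + 1$, and $\alpha_i = \gamma'_i$ and $\beta_i = \gamma''_i$ for all $1 \le i \le r$. Then we get that $u = \alpha_1 \beta_1 \cdots \alpha_{s-1} \beta_{s-1} \alpha_s$ and
$x = \beta_s \alpha_{s+1} \beta_{s+1} \cdots \alpha_r \beta_r$. We can then verify that the remaining conditions of the theorem hold using these definitions
of $\alpha_i, \beta_i$. □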

Example 4. If $x = abca$ and $u = cbabc$ then $xu \sim_3 ux$ (which was shown in Example 3). Note that $x$ and $u$ have
different lengths and thus cannot share an A-primitive root.
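For a concrete pair such as the one in Example 4, the factorization promised by Theorem 9 can be read off the blocks of $ux$. The sketch below assumes that every $\alpha_i$ has length $|u| \bmod n$, which is what condition (c) forces when $0 < |\alpha_i| < n$; the function name and the return format are illustrative, and boundary cases such as an empty $x$ are not treated.

```python
from collections import Counter

def theorem9_factorization(u, x, n):
    """Try to split ux into blocks alpha_i beta_i of length n satisfying
    conditions (a) and (b) of Theorem 9; return the pairs, or None."""
    w = u + x
    if n <= 0 or len(w) == 0 or len(w) % n != 0:
        return None
    q = len(u) % n  # assumed common length of the alpha_i
    pairs = [(w[i:i + q], w[i + q:i + n]) for i in range(0, len(w), n)]
    a0, b0 = Counter(pairs[0][0]), Counter(pairs[0][1])
    if all(Counter(a) == a0 and Counter(b) == b0 for a, b in pairs):
        return pairs
    return None

print(theorem9_factorization("cbabc", "abca", 3))
# [('cb', 'a'), ('bc', 'a'), ('bc', 'a')]
```

Here $u = \alpha_1 \beta_1 \alpha_2 = cb \cdot a \cdot bc$ and $x = \beta_2 \alpha_3 \beta_3 = a \cdot bc \cdot a$, so $s = 2$ and $r = 3$.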

The case where both x and u have A-primitive roots of length n is of particular interest: 

Corollary 2. Let $u, x \in \Sigma^*$ with $ux \sim_n xu$. If $u$ has an A-primitive root of length $n$, then $x$ does as well, and these
A-roots are the same.

Corollary 2 is analogous to the second Lyndon-Schützenberger theorem (see, e.g., Lothaire [14] or Shyr [18]), which
can be interpreted (in part) as $ux = xu$ if and only if $x$ and $u$ both have the same primitive root.

We note that the conditions of $\sim_n$ cannot be weakened so that not all of the subwords of both $u$ and $x$ are required to have
identical images under $\Psi$ while Theorem 9 still holds, as the following example demonstrates:

Example 5. Let $\sim_n$ instead be the binary relation defined on words by $u \sim_n x$ if we can write $u = \alpha_1 \alpha_2 \cdots \alpha_m$ and
$x = \beta_1 \beta_2 \cdots \beta_m$ where

(a) for all $1 \le i \le m$, $|\alpha_i| = |\beta_i| = n$.

(b) for all $1 \le i \le m$, $\Psi(\alpha_i) = \Psi(\beta_i)$.

Thus, only parallel subwords of length $n$ are required to be permutations of one another in this definition. But note
that if $x = a$ and $u = baa$ then $abaa \sim_2 baaa$. Note that no factorization of $x$ and $u$ of the form of Theorem 9 can exist,
as $u$ cannot be factored as $u = \beta u' \beta$ for any nonempty word $\beta$.
9 Conclusions



We have studied the formal language theoretic and combinatorial properties of Abelian primitive words. Unlike
for classical primitive words, counting Abelian primitive words is a nontrivial combinatorial problem. On the other
hand, we have shown that the set of Abelian primitive words is not context-free, whereas the corresponding question is a
long-standing open problem for primitive words. Future research problems include an exact enumeration of the
Abelian primitive words of length $n$.